Towards Unbounded Machine Unlearning
Deep machine unlearning is the problem of 'removing' from a trained neural network a subset of its training set. This problem is very timely and has many applications, including the key tasks of removing biases (RB), resolving confusion (RC) caused by mislabelled data in trained models, and allowing users to exercise their 'right to be forgotten' to protect User Privacy (UP). This paper is the first, to our knowledge, to study unlearning for different applications (RB, RC, UP), with the view that each has its own desiderata, definitions of 'forgetting', and associated metrics for forget quality. For UP, we propose a novel adaptation of a strong Membership Inference Attack for unlearning. We also propose SCRUB, a novel unlearning algorithm, which is the only method that is consistently a top performer for forget quality across the different application-dependent metrics for RB, RC, and UP. At the same time, SCRUB is also consistently a top performer on metrics that measure model utility.
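The UP evaluation described above can be probed with a loss-based membership inference attack. The sketch below is an illustrative threshold attack, not the paper's exact adaptation: it scores forget quality by how well per-example losses separate the forget set from unseen test data.

```python
import numpy as np

def mia_forget_score(forget_losses, test_losses):
    """Threshold-based membership inference as a forget-quality probe.

    Sweep a loss threshold and report the best balanced accuracy at
    telling forget-set examples apart from unseen test examples.
    Members of the training set tend to have lower loss, so predict
    'member' when loss <= threshold.
    """
    losses = np.concatenate([np.asarray(forget_losses, float),
                             np.asarray(test_losses, float)])
    labels = np.concatenate([np.ones(len(forget_losses)),
                             np.zeros(len(test_losses))])
    best = 0.5
    for t in np.unique(losses):
        pred = losses <= t                    # predicted 'member'
        tpr = pred[labels == 1].mean()        # forget set flagged as member
        tnr = 1.0 - pred[labels == 0].mean()  # test set flagged as non-member
        best = max(best, (tpr + tnr) / 2.0)
    return best
```

A score near 0.5 suggests the unlearned model treats forgotten examples like data it never saw; a score near 1.0 means the forget set is still detectable.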
Instructional Goal-Aligned Question Generation for Student Evaluation in Virtual Lab Settings: How Closely Do LLMs Actually Align?
Knipper, R. Alexander, Dey, Indrani, Sarkar, Souvika, Narayanan, Hari, Puntambekar, Sadhana, Karmaker, Santu
Virtual Labs offer valuable opportunities for hands-on, inquiry-based science learning, yet teachers often struggle to adapt them to fit their instructional goals. Third-party materials may not align with classroom needs, and developing custom resources can be time-consuming and difficult to scale. Recent advances in Large Language Models (LLMs) offer a promising avenue for addressing these limitations. In this paper, we introduce a novel alignment framework for instructional goal-aligned question generation, enabling teachers to leverage LLMs to produce simulation-aligned, pedagogically meaningful questions through natural language interaction. The framework integrates four components: instructional goal understanding via teacher-LLM dialogue, lab understanding via knowledge unit and relationship analysis, a question taxonomy for structuring cognitive and pedagogical intent, and the TELeR taxonomy for controlling prompt detail. Early design choices were informed by a small teacher-assisted case study, while our final evaluation analyzed over 1,100 questions from 19 open-source LLMs. With goal and lab understanding grounding questions in teacher intent and simulation context, the question taxonomy elevates cognitive demand (open-ended formats and relational types raise quality by 0.29-0.39 points), and optimized TELeR prompts enhance format adherence (80% parsability, >90% adherence). Larger models yield the strongest gains: parsability +37.1%, adherence +25.7%, and average quality +0.8 Likert points.
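The TELeR-controlled prompt detail mentioned above can be sketched as a function that layers increasing amounts of instructional context into the prompt. This is an illustrative mapping of detail levels; the function name, level cut-offs, and wording are our assumptions, not the framework's exact prompts.

```python
def build_prompt(level, goal, lab_summary, question_type):
    """Compose a question-generation prompt at an assumed TELeR-style
    detail level: higher levels add more instructional context and
    stricter output-format constraints."""
    prompt = "Generate one assessment question for a virtual lab."
    if level >= 2:
        prompt += f" Instructional goal: {goal}."
    if level >= 3:
        prompt += f" Lab summary: {lab_summary}."
    if level >= 4:
        prompt += (f" The question must be {question_type} and target"
                   " a higher-order cognitive skill.")
    if level >= 5:
        prompt += " Return JSON with keys 'question' and 'answer'."
    return prompt
```

The abstract's finding that optimized prompts improve parsability and adherence corresponds to the highest level here, where the output format is pinned down explicitly.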
We thank the reviewers for their very constructive feedback! In contrast, our method's gradient is [...]. Put simply, Kiryo et al. stop optimizing [...]. Empirically, we find that this "soft-constraint" approach to implausible negative risk yields comparable or better models. We also show in the supplementals (e.g., Sec. [...]) [...] PU learning work (including Kiryo et al. in their nnPU paper), which uses neural networks. However, our experiments show that PUc's biggest limitation is not its representation: on unshifted data (Table 1, row 1) [...]; on shifted data (Tab. 1), PUc's performance degrades while our methods' performance improves. We will add a "Discussion" subsection to the paper's "Experimental Results" (Sec. [...]).
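The contrast with Kiryo et al. can be sketched in code. This is an illustrative reconstruction only: nnPU's hard clamp on the negative-risk term versus a soft penalty on implausibly negative risk; the function names, the class prior `pi`, and the penalty weight `lam` are our assumptions, not the rebuttal's exact formulation.

```python
def nnpu_risk(pos_risk_pos, unl_risk_neg, pos_risk_neg, pi=0.4):
    """Non-negative PU risk in the style of Kiryo et al. (nnPU):
    the negative-class term is clamped at zero, which stops
    optimization whenever that term would go negative."""
    neg_part = unl_risk_neg - pi * pos_risk_neg
    return pi * pos_risk_pos + max(0.0, neg_part)

def soft_pu_risk(pos_risk_pos, unl_risk_neg, pos_risk_neg, pi=0.4, lam=1.0):
    """Soft-constraint variant (our sketch): keep the raw negative-risk
    term so gradients still flow, and add a quadratic penalty when it
    becomes implausibly negative."""
    neg_part = unl_risk_neg - pi * pos_risk_neg
    return pi * pos_risk_pos + neg_part + lam * max(0.0, -neg_part) ** 2
```

When the negative-risk term is positive the two objectives agree; they differ only in how they react once that term turns negative.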
D-NLP at SemEval-2024 Task 2: Evaluating Clinical Inference Capabilities of Large Language Models
Large language models (LLMs) have garnered significant attention and widespread usage due to their impressive performance in various tasks. However, they are not without their own set of challenges, including issues such as hallucinations, factual inconsistencies, and limitations in numerical-quantitative reasoning. Evaluating LLMs in miscellaneous reasoning tasks remains an active area of research. Prior to the breakthrough of LLMs, Transformers had already proven successful in the medical domain, effectively employed for various natural language understanding (NLU) tasks. Following this trend, LLMs have also been trained and utilized in the medical domain, raising concerns regarding factual accuracy, adherence to safety protocols, and inherent limitations. In this paper, we focus on evaluating the natural language inference capabilities of popular open-source and closed-source LLMs using clinical trial reports as the dataset. We present the performance results of each LLM and further analyze their performance on a development set, particularly focusing on challenging instances that involve medical abbreviations and require numerical-quantitative reasoning. Gemini, our leading LLM, achieved a test set F1-score of 0.748, securing the ninth position on the task scoreboard. Our work is the first of its kind, offering a thorough examination of the inference capabilities of LLMs within the medical domain.
A Methodology for Questionnaire Analysis: Insights through Cluster Analysis of an Investor Competition Data
Forster, Carlos Henrique Q., de Castro, Paulo André Lima, Ramalho, Andrei
In this paper, we propose a methodology for the analysis of questionnaire data, along with its application to discovering insights from investor data motivated by a day-trading competition. The questionnaire includes categorical questions, which are reduced to binary 'yes'/'no' questions. The methodology reduces dimensionality by grouping questions and participants with similar responses using cluster analysis. Rule discovery was performed using a conversion-rate metric. Innovative visual representations were proposed to validate the cluster analysis and the discovery of relations between questions. When cross-referenced with financial data, additional insights were revealed related to the recognized clusters.
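The conversion-rate metric used for rule discovery can be sketched as follows. This is an assumed definition, since the abstract does not spell it out: among respondents answering 'yes' to a binary question, the fraction achieving the outcome of interest.

```python
import numpy as np

def conversion_rate(answers, outcomes):
    """Assumed conversion-rate metric: fraction of 'yes' respondents
    (answers == True) who achieve the outcome (outcomes == True)."""
    answers = np.asarray(answers, dtype=bool)
    outcomes = np.asarray(outcomes, dtype=bool)
    if answers.sum() == 0:
        return 0.0  # no 'yes' respondents: rate is undefined, report 0
    return float(outcomes[answers].mean())
```

Comparing this rate across question clusters is one way the paper's rules could surface: a question whose 'yes' group converts far above the base rate is a candidate rule.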
PGA Tour makes schedule changes in response to LIV Golf's rise, including more designated events with no cuts
The PGA Tour is making major changes to its schedule and how several of its events are played as LIV Golf's second season gets underway. The PGA Tour ratified a motion Tuesday that reduces fields for eight designated events in 2024 to between 70 and 80 golfers with no 36-hole cuts. The tour has not announced which events will be affected, but the majors, the FedEx Cup Playoffs and the Players Championship will not be included in the changes.
Comprehensive Guide to Model Selection
To help navigate the abundant options for creating a machine learning model, we recommend a five-step process that results in a useful data product. Start with a Human-Centered Design (HCD) approach: HCD focuses on the challenges faced by the end user and uses a framework to make up-front decisions that will guide the remainder of the model selection process. This keeps the data scientist working toward resolving the business problem, not getting mired in technical difficulties. With the learnings from the HCD phase in mind, the data scientist next surveys the landscape of models that could tackle the business challenge.
Cobots not robots: the future of sales
Life has been tough for salespeople raised on face-to-face meetings and networking. Yet while some have railed against the lack of personal contact over the past 18 months, digitally empowered individuals have gone from strength to strength, tapping into deep online data and fast video access to prospects to transform the sales process. Digital tools, including AI and cobots, have changed the nature of sales for good. They support salespeople by identifying the most likely deals more accurately, reducing wasted time and improving conversion. They define the most successful approach for each prospect engagement, allowing individuals to move away from a standard, restrictive sales methodology.